Information Dropout: learning optimal representations through noise

Authors

  • Alessandro Achille
  • Stefano Soatto
Abstract

We introduce Information Dropout, a generalization of dropout motivated by the Information Bottleneck principle, which highlights how injecting noise into the activations can help in learning optimal representations of the data. Information Dropout is rooted in information-theoretic principles; it includes several existing dropout methods as special cases, such as Gaussian Dropout and Variational Dropout, and, unlike classical dropout, it can learn and build representations that are invariant to nuisances of the data, such as occlusions and clutter. When the task is reconstruction of the input, we show that Information Dropout yields a variational autoencoder as a special case, thus providing a link between representation learning, information theory, and variational inference. Our experiments validate the theoretical intuitions behind the method, and we find that Information Dropout achieves generalization performance comparable to or better than binary dropout, especially on smaller models, since it can automatically adapt the noise to the structure of the network, as well as to the test sample.
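
To make the mechanism concrete, below is a minimal PyTorch-style sketch (not the authors' implementation) of an Information Dropout layer: multiplicative log-normal noise whose scale alpha(x) is learned as a function of the input. The layer name, the sigmoid parametrization of alpha, and the simplified penalty standing in for the paper's KL regularizer are all illustrative assumptions.

    import torch
    import torch.nn as nn

    class InformationDropout(nn.Module):
        """Multiplicative log-normal noise with a learned, input-dependent
        scale alpha(x). Sketch only: the penalty below is a simplified
        stand-in for the paper's information (KL) regularizer."""

        def __init__(self, features, max_alpha=0.7):
            super().__init__()
            # A linear layer predicts a per-unit noise scale from the input.
            self.alpha_layer = nn.Linear(features, features)
            self.max_alpha = max_alpha

        def forward(self, x):
            # alpha(x) in (0, max_alpha]: more noise means less information
            # about x is transmitted, which is how nuisances get discarded.
            alpha = self.max_alpha * torch.sigmoid(self.alpha_layer(x))
            if not self.training:
                return x, x.new_zeros(())
            # Sample eps ~ logNormal(0, alpha^2), injected multiplicatively.
            eps = torch.exp(alpha * torch.randn_like(x))
            # Penalize transmitting information: -log(alpha) is large when
            # alpha is small, i.e. when a unit is kept nearly noise-free.
            penalty = -torch.log(alpha / self.max_alpha + 1e-8).mean()
            return x * eps, penalty

During training, the returned penalty would be added to the task loss (e.g. cross-entropy) with a trade-off weight, so the network keeps a unit noise-free only when it carries task-relevant information.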


Similar Articles

Information Dropout: Learning Optimal Representations Through Noisy Computation

The cross-entropy loss commonly used in deep learning is closely related to the defining properties of optimal representations, but does not enforce some of the key properties. We show that this can be solved by adding a regularization term, which is in turn related to injecting multiplicative noise in the activations of a Deep Neural Network, a special case of which is the common practice of d...


Analyzing noise in autoencoders and deep networks

Autoencoders have emerged as a useful framework for unsupervised learning of internal representations, and a wide variety of apparently conceptually disparate regularization techniques have been proposed to generate useful features. Here we extend existing denoising autoencoders to additionally inject noise before the nonlinearity, and at the hidden unit activations. We show that a wide variety...
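
As a rough illustration of that setup, the sketch below adds Gaussian noise at the input, before the nonlinearity, and at the hidden activations of a one-layer autoencoder; the class name, noise level, and exact injection points are assumptions, not the paper's configuration.

    import torch
    import torch.nn as nn

    class NoisyAutoencoder(nn.Module):
        """Denoising autoencoder with two extra injection points:
        before the encoder nonlinearity and at the hidden activations."""

        def __init__(self, dim_in, dim_hidden, sigma=0.1):
            super().__init__()
            self.enc = nn.Linear(dim_in, dim_hidden)
            self.dec = nn.Linear(dim_hidden, dim_in)
            self.sigma = sigma

        def forward(self, x):
            x = x + self.sigma * torch.randn_like(x)        # corrupt the input
            pre = self.enc(x)
            pre = pre + self.sigma * torch.randn_like(pre)  # noise before nonlinearity
            h = torch.sigmoid(pre)
            h = h + self.sigma * torch.randn_like(h)        # noise at hidden activations
            return self.dec(h)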


Low Dropout Based Noise Minimization of Active Mode Power Gated Circuit

Power gating techniques reduce leakage power in a circuit. However, power gating leads to large voltage fluctuations on the power rail during the transition from power-gating mode to active mode, due to the package inductance of the printed circuit board. This voltage fluctuation may cause unwanted transitions in neighboring circuits. In this work, a power gating architecture is developed for minimizing power in a...


The Role of Information Complexity and Randomization in Representation Learning

A grand challenge in representation learning is to learn the different explanatory factors of variation behind the high dimensional data. Encoder models are often determined to optimize performance on training data when the real objective is to generalize well to unseen data. Although there is enough numerical evidence suggesting that noise injection (during training) at the representation leve...


Follow the Leader with Dropout Perturbations

We consider online prediction with expert advice. Over the course of many trials, the goal of the learning algorithm is to achieve small additional loss (i.e. regret) compared to the loss of the best from a set of K experts. The two most popular algorithms are Hedge/Weighted Majority and Follow the Perturbed Leader (FPL). The latter algorithm first perturbs the loss of each expert by independen...
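
For intuition, here is a small NumPy sketch of the dropout-perturbation idea: before each trial, every past loss is independently dropped with some probability, and the expert with the smallest perturbed cumulative loss is followed. The function name, drop probability, and handling of the first trial are illustrative assumptions.

    import numpy as np

    def follow_the_leader_with_dropout(losses, drop_prob=0.5, seed=0):
        """losses: (T, K) array of per-trial losses for K experts.
        Returns the algorithm's total loss over the T trials."""
        rng = np.random.default_rng(seed)
        T, K = losses.shape
        total = 0.0
        for t in range(T):
            if t == 0:
                choice = int(rng.integers(K))             # no history yet: pick arbitrarily
            else:
                keep = rng.random((t, K)) >= drop_prob    # drop each past loss i.i.d.
                perturbed = (losses[:t] * keep).sum(axis=0)
                choice = int(perturbed.argmin())          # follow the perturbed leader
            total += losses[t, choice]
        return total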



Journal:
  • CoRR

Volume: abs/1611.01353
Issue: -

Pages: -

Publication date: 2016